Reproducible Workflows

Spring 2026

Reproducibility

Reproducibility is obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis.

Step 1: Writing (and getting) Results

Markup Languages

A format that combines code, results, and commentary into a single file

Markup files are designed for:

  • Communicating to those who want to focus on the conclusions and not the code (advisers, clients, me)

  • Collaborating with other data scientists who are interested in the conclusions and the code (your peers)

  • Serving as a modern-day lab notebook where you can document 1) what you did, 2) what you were thinking, and 3) how you got there.

Examples include: HTML, LaTeX, Markdown

Quarto

  • Built into the RStudio environment
  • Successor to R Markdown
  • Words + Code
  • Can be edited as Markdown source OR in a visual editor

Quarto Demo

Version Control

We’ve all been here…

Git vs. GitHub

  • Git allows us to keep track of different versions of file(s)
  • GitHub is a website where we can upload and monitor these tracked changes.
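
To make the distinction concrete, here is a minimal sketch of Git tracking two versions of a file entirely on your own machine, with no GitHub account or network involved. The file name analysis.R, its contents, and the "Student" identity are placeholders chosen for illustration, not part of the course materials.

```shell
set -e

# Placeholder identity so the commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# A throwaway folder; nothing here touches GitHub or the network.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Version 1 of a hypothetical analysis file.
echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "First version of analysis"

# Version 2: edit the file and record the change.
echo "x <- 2" > analysis.R
git add analysis.R
git commit --quiet -m "Change x to 2"

# Git now holds both versions locally.
n_versions="$(git rev-list --count HEAD)"
git log --oneline
```

GitHub only enters the picture when these commits are pushed to a repository hosted there.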

In this class…

  1. You will all set up a complete Git workflow
    • Please send me your GitHub username.
  2. We will have a class GitHub organization
  3. You will each have weekly repositories. I will upload these the weekend before.
  4. You will each clone repos from GitHub.
    • This will create an RStudio Project (.Rproj)
    • This goes into a folder on your machine.
    • My advice: keep one ‘mother’ folder for the class, with a folder for each week inside it
  5. Each week you may have notes (.R file), Quarto files (.qmd/.html), or other files. Each day you will commit and push any changes.
    • Pull at the start of every class in case anything has changed online. Then continue… make commits… push… repeat
  6. All files will be turned in this way.
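
The clone, pull, commit, push cycle above can be sketched on the command line. This is a self-contained illustration under stated assumptions, not the class setup itself: a local bare repository stands in for the weekly repo on GitHub so the script runs without a network, and the names week01, class, README.md, and notes.R are hypothetical. In practice the clone source would be the repository's GitHub URL.

```shell
set -e

# Placeholder identity so the commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
remote="$work/week01.git"   # local stand-in for the weekly repo on GitHub

# Instructor side: create the weekly repo and upload a starter file.
git init --quiet --bare "$remote"
git clone --quiet "$remote" "$work/instructor"
cd "$work/instructor"
echo "# Week 1" > README.md
git add README.md
git commit --quiet -m "Add starter files"
git push --quiet origin HEAD

# Student side: clone once into a weekly folder inside the class folder.
mkdir "$work/class"
git clone --quiet "$remote" "$work/class/week01"
cd "$work/class/week01"

# Each class: pull first, do the work, then commit and push.
git pull --quiet
echo "x <- 1" > notes.R
git add notes.R
git commit --quiet -m "Add notes from class"
git push --quiet

# The remote now holds the starter commit plus the student's commit.
n_remote="$(git --git-dir="$remote" rev-list --count --all)"
echo "Commits on the remote: $n_remote"
```

Pulling before starting work keeps the local copy in step with anything uploaded since the last class, which is why it comes first in the loop.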

(Y)ou are a real, valid, competent user and programmer no matter what IDE you develop in or what tools you use to make your work work for you